You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This PR contains changes to stabiity filter for backward euler and also VSVO implementation for BDF based on the paper here
Add any other context about the problem here.
Sorry for the delay it took me a whie to fully understand the paper and make sure the implementation matches with the algorithm defined in the paper
@ChrisRackauckas : Please review and provide feedback. I did rebase this time too.
@oscardssmith@ChrisRackauckas
I did a benchmark test comparing MOOSE234 and FBDF mehtods. Even though MOOSE234 takes fewer steps on stiff problems, each step requires more Newton iterations (f-evals), so the total cost is higher. I tried an adaptive predictor degree approach but it did not close the gap. I'm investigating further and open to suggestions.
Looking into why MOOSE234 uses more f-evals than FBDF. Here's what came out of
the analysis.
How I got here
Started by running work-precision diagrams (error vs f-evals) on VdP, ROBER,
and HIRES comparing MOOSE234 against FBDF, QNDF, CVODE_BDF, Rodas5P, RadauIIA5,
and KenCarp4. The plots showed MOOSE234's curve consistently shifted to the right
of FBDF — same accuracy but more work.
Since both solvers use the exact same Newton solver (nlsolve!), same convergence
logic, same maxiters=10, same kappa=1/100, the difference had to come from what
each method feeds into Newton or does around it. I wrote a step-by-step
diagnostic (newton_diagnostics.jl) that hooks into init()/step!() and logs
per-step Newton iterations, f-evals, Jacobian updates, and order for both solvers.
That trace showed two clear patterns:
MOOSE234: nf per step = newton_iter + 1 (consistently)
FBDF: nf per step = newton_iter (consistently)
So every single step, MOOSE234 was spending one extra f-eval that FBDF wasn't.
Traced it to the f(fsallast) call at the end of perform_step!, then confirmed
that nothing reads that value (isfsal=false, interpolation uses Chebyshev nodes).
After fixing that, re-ran the work-precision plots and the per-step traces
across all four problems to confirm the improvement and understand what gap
remains.
What was wrong
MOOSE234 was calling f(fsallast) after every step that was redundant.
FBDF skips this by recovering f(u) algebraically from the BDF equation.
Removed the dead call — saves exactly 1 f-eval per accepted step.
Numbers (MOOSE234 after fix vs FBDF)
Column key:
tol = solver tolerance (abstol = reltol = this value)
MOOSE234 = total function evaluations (nf) for MOOSE234
FBDF = total function evaluations (nf) for FBDF
Ratio = MOOSE234 nf / FBDF nf (lower is better for MOOSE234)
M steps = accepted steps taken by MOOSE234
F steps = accepted steps taken by FBDF
VdP (mu=1000):
tol MOOSE234 FBDF Ratio M steps F steps
1e-4 43 32 1.34x 24 21
1e-6 82 56 1.46x 37 43
1e-8 213 99 2.15x 71 75
1e-10 246 175 1.41x 194 156
(Before the fix, VdP at 1e-6 was 122 nf. Now 82. That's a 33% drop.)
ROBER:
tol MOOSE234 FBDF Ratio M steps F steps
1e-6 1142 569 2.01x 193 217
1e-8 5778 958 6.03x 785 495
1e-10 5128 1899 2.70x 3527 1184
HIRES (8 eqns):
tol MOOSE234 FBDF Ratio M steps F steps
1e-4 406 420 0.97x 47 85
1e-6 996 618 1.61x 159 169
1e-8 2017 1064 1.90x 640 385
1e-10 7210 2097 3.44x 2923 854
Linear stiff:
tol MOOSE234 FBDF Ratio M steps F steps
1e-4 128 80 1.60x 41 55
1e-6 208 135 1.54x 116 102
1e-8 525 299 1.76x 438 224
1e-10 2009 675 2.98x 1918 503
Why MOOSE234 is still more expensive after the fix
Three things:
FBDF goes up to order 5, MOOSE234 caps at 4. At tight tolerances this really
shows — FBDF takes way fewer steps (ROBER at 1e-10: 1184 steps vs 3527).
On VdP at loose tolerances, MOOSE234 gets stuck at order 2. At tol=1e-6 it's
order 2 for 97% of steps. The other problems are fine — ROBER/HIRES/Linear
all reach order 4 without issues. The problem is the order controller needs
est3 < est2 to try order 4, and on VdP the FBDF4 filter correction is always
bigger than the BDF3-Stab correction, so it never tries.
Order 3 is basically never picked on any problem either — the method jumps
straight from 2 to 4:
Problem Ord 2 Ord 3 Ord 4 (at tol=1e-6)
VdP 97% 0% 3%
ROBER 10% 0% 90%
HIRES 16% 1% 83%
Linear 31% 1% 68%
More Newton iterations in VdP's transient region. MOOSE234 has 5 steps needing
4-6 Newton iters (t < 0.005) vs FBDF's 1 step with 3 iters. Being stuck at
order 2 means a worse predictor, so Newton has to work harder.
What I am looking at next
Compute Est4 properly so the order controller can pick order 4 on VdP. The
paper has some details I want to review.
Figure out why order 3 is never selected. Might be a problem with how the
error estimators are set up.
MOOSE234 rejects a lot more steps than FBDF (HIRES at 1e-10: 109 vs 21).
The step size controller might be too aggressive.
Work-precision plots (updated with fix)
Re-ran the full benchmark suite (7 solvers, tolerances 1e-4 to 1e-10) on VdP,
ROBER, and HIRES after the fix. Plots are in test/benchmark_results/.
VdP: MOOSE234 tracks close to FBDF at loose tolerances but falls behind at tight
tolerances where FBDF's order-5 capability kicks in. CVODE_BDF and Radau are the
most efficient on this problem overall.
ROBER: MOOSE234 is the most expensive solver across the board. Its curve is far
to the right of everything else, especially at tight tolerances. This is the
worst-case problem for MOOSE234.
HIRES: MOOSE234 is competitive at loose tolerances — its curve overlaps with
FBDF and QNDF around 1e-4 to 1e-5 error. The gap grows at tight tolerances
but stays smaller than on ROBER. This is the closest to a "win" for MOOSE234
among the problems tested.
Across all three problems, the nf-based work-precision plots confirm that the
f(fsallast) fix helped (curves shifted left compared to pre-fix runs) but
MOOSE234 still has structural disadvantages vs FBDF from the order cap and
order controller behavior described above.
Since it is a variable order, u_history might be corrupted with incorrect values causing estimate for est3 and est2 to be error prone. I am looking at storing the raw values and computing everytime.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Checklist
contributor guidelines, in particular the SciML Style Guide and
COLPRAC.
Additional context
This PR contains changes to stabiity filter for backward euler and also VSVO implementation for BDF based on the paper here
Add any other context about the problem here.
Sorry for the delay it took me a whie to fully understand the paper and make sure the implementation matches with the algorithm defined in the paper
@ChrisRackauckas : Please review and provide feedback. I did rebase this time too.